The Development of Lexical Resources for Information Extraction from Text Combining WordNet and Dewey Decimal Classification
نویسنده
چکیده
Lexicon definition is one of the main bottlenecks in the development of new applications in the field of Information Extraction from text. Generic resources (e.g., lexical databases) are promising for reducing the cost of specific lexica definition, but they introduce lexical ambiguity. This paper proposes a methodology for building application-specific lexica by using WordNet. Lexical ambiguity is kept under control by marking synsets in WordNet with field labels taken from the Dewey Decimal Classification.
منابع مشابه
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
متن کاملA Semantic Method to Information Extraction for Decision Support Systems
In this paper, we describe a novel schema for a more semantic text mining process which results in more comprehensive decision making activity by decision support systems via providing more effective and accurate textual information. The utility of two semantic lexical resources; FrameNet and WordNet, in extracting required text snippets from unstructured free texts yields a better and more acc...
متن کاملSemantic Extraction with Wide-Coverage Lexical Resources
We report on results of combining graphical modeling techniques with Information Extraction resources (Pattern Dictionary and Lexicon) for both frame and semantic role assignment. Our approach demonstrates the use of two human built knowledge bases (WordNet and FrameNet) for the task of semantic extraction.
متن کاملExtracting Lexico-conceptual Knowledge for Developing Persian WordNet
Semantic lexicons and lexical ontologies are some major resources in natural language processing. Developing such resources are time consuming tasks for which some automatic methods are proposed. This paper describes some methods used in semi-automatic development of FarsNet; a lexical ontology for the Persian language. FarsNet includes the Persian WordNet with more than 10000 synsets of nouns,...
متن کاملGenerating Training Data for Semantic Role Labeling based on Label Transfer from Linked Lexical Resources
We present a new approach for generating role-labeled training data using Linked Lexical Resources, i.e., integrated lexical resources that combine several resources (e.g., WordNet, FrameNet, Wiktionary) by linking them on the sense or on the role level. Unlike resource-based supervision in relation extraction, we focus on complex linguistic annotations, more specifically FrameNet senses and ro...
متن کامل